[Lazy evaluation] Pytato Array Context with transformations#248
[Lazy evaluation] Pytato Array Context with transformations#248kaushikcfd wants to merge 3 commits intoinducer:mainfrom
Conversation
|
|
||
|
|
||
| @for_each_kernel | ||
| def _single_grid_work_group_transform(kernel, cl_device): |
There was a problem hiding this comment.
@inducer said in a personal meeting that this logic should also handle reductions kernels.
inducer
left a comment
There was a problem hiding this comment.
Thanks! A few quick drive-by thoughts below.
| kernel = lp.split_iname(kernel, iname, | ||
| ngroups * l_zero_size * l_one_size) | ||
| kernel = lp.split_iname(kernel, f"{iname}_inner", | ||
| l_zero_size, inner_tag="l.0") | ||
| kernel = lp.split_iname(kernel, f"{iname}_inner_outer", | ||
| l_one_size, inner_tag="l.1", | ||
| outer_tag="g.0") |
There was a problem hiding this comment.
Doesn't this assume that each iname is only used by exactly one statement?
| if len(insn.within_inames) == 0: | ||
| continue | ||
|
|
||
| if len(insn.within_inames) == 1: |
There was a problem hiding this comment.
- Which type of kernels are those?
- I would like it if this were more guided by metadata, along the lines of what the pyopencl actx does.
There was a problem hiding this comment.
We could borrow the unifier from FusionContractorArrayContext, but that would add a dependency on #284 and inducer/pytato#224. If those go in before this PR, I will put that here.
| t_unit = _single_grid_work_group_transform(t_unit, self.queue.device) | ||
| t_unit = lp.set_options(t_unit, "insert_gbarriers") | ||
| t_unit = lp.linearize(lp.preprocess_kernel(t_unit)) | ||
| t_unit = _alias_global_temporaries(t_unit) |
There was a problem hiding this comment.
My recollection of our discussion was that we'd do this without aliasing... am I remembering wrong? If not, what made you change your mind?
There was a problem hiding this comment.
I thought we wouldn't alias at the pytato's CodeGenMapper stage, but post-linearization grabbing hold of dead-temporaries would be trivial. I.e. I had thought we would alias the global temporaries not at the pytato-level but downstream as loopy transformation.
|
When trying this on the mirgecom examples, I'm getting this type of error: |
|
@matthiasdiener: Sorry, the transformations in this branch depend on some PRs on the pytato/loopy/arraycontext end. I've edited the description to record that. |
32a7d59 to
bd964da
Compare
c166b2f to
0c55627
Compare
9fb8370 to
8c65687
Compare
824caee to
405bb8b
Compare
32b0a78 to
0dcc37c
Compare
0dcc37c to
6bfe6af
Compare
|
Closing this as inducer/arraycontext#216 provides a cleaner implementation. |
Draft because:
ReturnFromKernelat a point in the linearized kernel loopy#476Related/linked PRs: